Reproducibility of APT-weighted CEST-MRI at 3T in healthy brain and tumor across sessions and scanners

Amide proton transfer (APT)-weighted chemical exchange saturation transfer (CEST) imaging is a recent MRI technique making its way into clinical application. In this work, we investigated whether APT-weighted CEST imaging can provide reproducible measurements across scan sessions and scanners. Within-session, between-session and between-scanner reproducibility was calculated for 19 healthy volunteers and 7 patients with a brain tumor on two 3T MRI scanners. The APT-weighted CEST effect was evaluated by calculating the Lorentzian Difference (LD), magnetization transfer ratio asymmetry (MTRasym), and relaxation-compensated inverse magnetization transfer ratio (MTRREX) averaged in whole brain white matter (WM), enhancing tumor and necrosis. Within-subject coefficient of variation (COV) calculations, Bland–Altman plots and mixed effect modeling were performed to assess the repeatability and reproducibility of averaged values. The group median COVs of LD APT were 0.56% (N = 19), 0.84% (N = 6) and 0.80% (N = 9) in WM within-session, between-session and between-scanner, respectively. The between-session COVs of LD APT in enhancing tumor (N = 6) and necrotic core (N = 3) were 4.57% and 5.67%, respectively. There were no significant differences in within-session, between-session and between-scanner comparisons of the APT effect. The COVs of LD and MTRREX were consistently lower than those of MTRasym in all experiments, both in healthy tissues and tumor. The repeatability and reproducibility of APT-weighted CEST were clinically acceptable across scan sessions and scanners. Although MTRasym is simple to acquire and compute and sufficient to provide robust measurement, it is beneficial to include LD and MTRREX to obtain higher reproducibility for detecting minor signal differences in different tissue types.


MRI experiment
A 3 Tesla MRI scanner equipped with a 32-channel head coil (MR750, General Electric, Chicago, USA) was used for the within/between-session comparisons in healthy volunteers. A 3 Tesla SIGNA PET-MRI scanner with a 24-channel head coil (General Electric, Chicago, USA) was used to assess between-scanner reproducibility in healthy volunteers and between-session reproducibility in patients. One scan session contained at minimum a T1-weighted structural scan and 2 identical CEST scans. The total scan duration of one session was approximately 15 min.
The design of this study is presented in Fig. 1. To assess within-session repeatability, each volunteer (N = 19) underwent one scan session including two identical CEST scans. To assess between-session reproducibility, six volunteers underwent the same session one week after the first session was acquired. To assess between-scanner reproducibility, nine volunteers underwent one scan session per scanner on the same day. For the patients (N = 7), only between-session reproducibility (on the same scanner) was assessed. For patients, the median time interval between 1.b.1 and 2.b.1 was 4 days.

Image acquisition
The pulse sequences used for image acquisition were identical on both systems. A 3D snapshot CEST image acquisition 14 was used with the following parameters: slice thickness = 3 mm, 14 slices, in-plane resolution 1.7 × 1.7 mm², matrix size = 128 × 104, readout flip angle 6°, ASSET acceleration factor of 3. The field of view was manually placed with the top slices 20 mm above the corpus callosum for suitable tissue separation of white and grey matter (WM and GM, respectively). Saturation was performed with B1,RMS = 1.5 µT using 80 Gaussian-shaped pulses of 20 ms with a 50% duty cycle. Z-spectra were obtained for 43 frequency offsets distributed between −100 and 100 ppm, relative to the water resonance set to 0 ppm (at ±100 ppm, ±50 to ±10 ppm in steps of 10 ppm, ±9 to ±5 ppm in steps of 1 ppm, ±4.5 to ±1 ppm in steps of 0.5 ppm, and −0.5 to 0.5 ppm in steps of 0.25 ppm). Four images were obtained with saturation pulses at −300 ppm, and the last of the four was selected as the reference image, yielding a total time of 4:30 min for each CEST scan.
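As a check on the sampling scheme described above, the offset list can be rebuilt from the stated ranges. The following is a minimal sketch, not the scanner protocol itself: the function names are hypothetical, and the saturation train length assumes that each 20 ms pulse is followed by an equal-length gap (one reading of a 50% duty cycle).

```python
def symmetric_range(start_ppm, stop_ppm, step_ppm):
    """Offsets from +start down to +stop (inclusive), mirrored to negative values."""
    n = int(round((start_ppm - stop_ppm) / step_ppm)) + 1
    positive = [start_ppm - i * step_ppm for i in range(n)]
    return positive + [-value for value in positive]


def build_offsets_ppm():
    """Rebuild the Z-spectrum sampling scheme from the ranges given in the text."""
    offsets = [100.0, -100.0]                          # +/-100 ppm
    offsets += symmetric_range(50.0, 10.0, 10.0)       # +/-50 to +/-10 ppm, 10 ppm steps
    offsets += symmetric_range(9.0, 5.0, 1.0)          # +/-9 to +/-5 ppm, 1 ppm steps
    offsets += symmetric_range(4.5, 1.0, 0.5)          # +/-4.5 to +/-1 ppm, 0.5 ppm steps
    offsets += [-0.5 + 0.25 * i for i in range(5)]     # -0.5 to 0.5 ppm, 0.25 ppm steps
    return sorted(offsets)


offsets_ppm = build_offsets_ppm()

# Assumed reading of the pulse train: 80 pulses of 20 ms at 50% duty cycle.
n_pulses, pulse_ms, duty_cycle = 80, 20.0, 0.5
saturation_time_s = n_pulses * pulse_ms / duty_cycle / 1000.0

print(len(offsets_ppm), saturation_time_s)
```

The ranges add up to the 43 offsets stated in the text, and the list contains both +3.5 and −3.5 ppm, which MTRasym needs.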

Data analysis
Image analysis was done with in-house written Matlab scripts (R2021a, The MathWorks, Natick, USA) 33 and the freely available FMRIB Software Library (FSL 5.0, Oxford, UK) 34,35. The CEST contrast maps of the brain were generated based on routines described in Wu et al. 15. Z-spectra were calculated from saturated CEST images normalized by the reference image. In each voxel, two-pool Lorentzian fitting was performed to fit the direct water saturation (DS) and MT effect to the Z-spectra. LD was computed to evaluate the CEST effect by subtracting the Z-spectra from the fitted Lorentzian function, and LD at 3.5 ppm was used for APT-weighted imaging.
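The LD calculation described above can be sketched as follows. This is an illustration only, not the in-house Matlab routine: it assumes the common Lorentzian line shape parameterized by amplitude, full width at half maximum and center, it uses hypothetical pool parameters in place of a voxel-wise least-squares fit (not shown), and all function names are placeholders.

```python
def lorentzian(offset_ppm, amplitude, fwhm_ppm, center_ppm):
    """Lorentzian line shape with the given amplitude, width (FWHM) and center."""
    half_width = fwhm_ppm / 2.0
    return amplitude * half_width ** 2 / (half_width ** 2 + (offset_ppm - center_ppm) ** 2)


def two_pool_reference(offset_ppm, ds_params, mt_params):
    """Reference Z-value built from the DS and MT pools only."""
    return 1.0 - lorentzian(offset_ppm, *ds_params) - lorentzian(offset_ppm, *mt_params)


def lorentzian_difference(z_measured, offset_ppm, ds_params, mt_params):
    """LD: fitted two-pool reference minus the measured Z-value at one offset."""
    return two_pool_reference(offset_ppm, ds_params, mt_params) - z_measured


# Hypothetical fit results (amplitude, FWHM in ppm, center in ppm) for one voxel.
ds_params = (0.9, 1.5, 0.0)
mt_params = (0.1, 40.0, -1.0)

# Simulated measurement: an extra 3% signal drop at 3.5 ppm on top of DS and MT.
z_reference = two_pool_reference(3.5, ds_params, mt_params)
z_measured = z_reference - 0.03

ld_apt = lorentzian_difference(z_measured, 3.5, ds_params, mt_params)
print(round(ld_apt, 4))
```

Because DS and MT are contained in the reference, only the residual dip at 3.5 ppm remains in LD.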
After that, the shift between the minimum of the Lorentzian fit and 0 ppm was recorded to create the B0 inhomogeneity map. This map was applied to the Z-spectra and LD for voxel-wise B0 correction to compensate for local field inhomogeneity. Subsequently, MTRasym and MTRREX 23,24 were computed for APT at +3.5 ppm.
In the healthy volunteers, WM, GM and the cerebrospinal fluid in the lateral ventricles (CSF) were selected as regions of interest (ROI). These ROIs were segmented on the high-resolution T1-weighted structural scans by 'FAST', available within the free online software FMRIB Software Library (FSL) v6.0 34,35. In the patients, the contrast enhancing area of tumor (CE), the area(s) encompassed by the enhancement (necrotic core), and contralateral healthy WM were selected as ROIs.
Tumors were segmented using an in-house segmentation pipeline. First, pre- and post-contrast T1-weighted, T2-weighted and FLAIR scans were rigidly groupwise registered to the post-contrast T1-weighted space. Automatic segmentation was then performed using HD-GLIO 39,40, nnUNet task 1 and 82, and an extended version of nnUNet 40,41. HD-GLIO is a segmentation algorithm specifically designed for enhancing glioma and is available at https://github.com/NeuroAI-HD/HD-GLIO. Segmentation predictions were combined using the multi-label STAPLE algorithm 42. The segmentations of the tumor contrast enhancing (CE) area and necrotic core were visually inspected and manually corrected if needed, using ITK-SNAP version 3.6.0 (University of Pennsylvania and Utah, USA) 43.
To register ROIs generated in the T1-weighted space of a participant to the CEST space, linear transformations were performed by registering the CEST image acquired at 6 ppm into T1-weighted space with 'FLIRT', available within FSL. The inverse of this transformation was used to register the ROIs from the T1-weighted space to the CEST space.
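The B0 correction and the two remaining metrics computed earlier in this section can be sketched as below. The formulas are the conventional definitions and are assumptions here, since the text does not spell them out: MTRasym is taken as Z(−3.5 ppm) − Z(+3.5 ppm), and MTRREX as 1/Z_lab − 1/Z_ref with Z_ref the fitted two-pool reference. The helper names and the linear interpolation are illustrative choices, not the authors' implementation.

```python
def interpolate(x, y, x_new):
    """Linear interpolation of y(x) at x_new; x must be sorted in ascending order."""
    if x_new <= x[0]:
        return y[0]
    for i in range(1, len(x)):
        if x_new <= x[i]:
            t = (x_new - x[i - 1]) / (x[i] - x[i - 1])
            return y[i - 1] + t * (y[i] - y[i - 1])
    return y[-1]


def b0_corrected_z(offsets_ppm, z_values, shift_ppm, target_ppm):
    """Z at target_ppm after moving the water minimum from shift_ppm back to 0 ppm."""
    return interpolate(offsets_ppm, z_values, target_ppm + shift_ppm)


def mtr_asym(z_positive, z_negative):
    """Assumed definition: Z(-offset) - Z(+offset)."""
    return z_negative - z_positive


def mtr_rex(z_lab, z_ref):
    """Assumed definition: 1/Z_lab - 1/Z_ref (inverse difference to the reference)."""
    return 1.0 / z_lab - 1.0 / z_ref


# Toy Z-spectrum segment around +3.5 ppm with a hypothetical +0.25 ppm field shift.
offsets_ppm = [3.0, 3.5, 4.0]
z_values = [0.80, 0.84, 0.88]
z_at_apt = b0_corrected_z(offsets_ppm, z_values, shift_ppm=0.25, target_ppm=3.5)

print(round(z_at_apt, 3), round(mtr_asym(0.84, 0.87), 3), round(mtr_rex(0.80, 1.00), 3))
```

Only MTRasym uses the Z-value at −3.5 ppm. This is why NOE effects upfield of water can enter MTRasym but not a metric evaluated at +3.5 ppm alone.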

Statistical analysis
Per participant and per ROI, the average APT-weighted LD, MTRasym and MTRREX were calculated. To assess within-session repeatability and between-session and between-scanner reproducibility, the coefficient of variation (COV) was calculated and Bland-Altman plots were generated.
The calculation of COV was based on previous methods 44. In each participant, the COV was calculated by dividing the standard deviation by the absolute mean of each APT-weighted metric per ROI across the different measurements: within session, between sessions, between scanners, and all sessions. The equations used for these calculations are given in the Appendix. Unless otherwise stated, the group median and interquartile ranges [Q1-Q3] for the COV are reported. Bland-Altman plots were created by plotting the ROI averages against the differences between the two measurements per participant used to assess within-session repeatability and between-session and between-scanner reproducibility for each CEST metric.
To test whether APT-weighted CEST measurements differed significantly between time points or scanners, linear mixed effect models were applied to investigate the effects of within-session, between-session and between-scanner variation on the CEST measurements.
Statistical analysis was performed with RStudio v2022.2.1.461 45 and Microsoft Excel 2016. The level of statistical significance was set at p < 0.05.
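The COV and Bland-Altman quantities described above can be sketched with the Python standard library. Several choices below are assumptions: the exact equations are in the Appendix and are not reproduced here, so the sample standard deviation (n − 1 denominator), the inclusive quartile method and the 1.96 SD limits of agreement are conventional choices rather than confirmed details of the analysis. The example values are invented.

```python
import statistics


def cov_percent(measurements):
    """Within-subject COV: standard deviation divided by the absolute mean, in percent."""
    mean = statistics.fmean(measurements)
    return 100.0 * statistics.stdev(measurements) / abs(mean)


def group_summary(covs):
    """Group median and interquartile range, returned as (median, Q1, Q3)."""
    q1, median, q3 = statistics.quantiles(covs, n=4, method="inclusive")
    return median, q1, q3


def bland_altman(first, second):
    """Per-participant means and differences, plus bias and assumed limits of agreement."""
    means = [(a + b) / 2.0 for a, b in zip(first, second)]
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Invented ROI averages (%) for three participants in two scans.
scan_1 = [2.00, 2.10, 1.95]
scan_2 = [2.20, 2.05, 2.00]

covs = [cov_percent(pair) for pair in zip(scan_1, scan_2)]
median, q1, q3 = group_summary(covs)
means, diffs, bias, limits = bland_altman(scan_1, scan_2)
print(f"COV median {median:.2f}% [{q1:.2f}-{q3:.2f}], bias {bias:.3f}")
```

With only two measurements per participant, the sample standard deviation reduces to the absolute difference divided by √2, so the COV and the Bland-Altman difference carry closely related information.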

Ethical approval
All procedures performed in studies involving human participants were approved by the medical ethics committee of the Erasmus MC and were in accordance with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Informed consent
Written informed consent was obtained from all subjects in this study.

Healthy volunteers
The COVs are shown in Table 2. The within-session group median COVs of LD and MTRREX APT in WM (N = 19) were 0.56 [0.20-1.01]% and 0.84 [0.38-1.27]%, respectively. The within-session group median COV was larger and had a larger interquartile range for APT evaluated by MTRasym (2.62 [0.94-7.08]% in WM). Across scan sessions, LD APT and MTRREX APT showed consistently low COVs within session, between session and between scanner, over all 4 scans. The COVs were larger for MTRasym compared with those of LD and MTRREX, and for all three metrics COVs were mostly lower for within-session measurements compared to measurements between sessions and scanners.
The Bland-Altman plots of LD APT, MTRasym and MTRREX APT in WM showed similar standard deviations for the within-session and between-session analyses (Fig. 2). Both MTRasym and MTRREX APT showed larger standard deviations for between-scanner than for within-session and between-session measurements.
The mixed effects analysis showed no significant effect of within-session, between-session or between-scanner variation on APT-weighted CEST measurement evaluated by any of the three metrics in the different ROIs.

Patients
An example of the APT effect visualized with different metrics across two sessions can be seen in Fig. 3. From visual inspection, LD and MTRasym consistently showed hyperintensity in the region of (enhancing) tumor compared with contralateral healthy tissue across two scan sessions, while for MTRREX hypointensity can be observed consistently in the tumor ROI compared with healthy tissue. The APT effect in two scan sessions and COVs for the CE tumor and tumor core are shown in Table 3. LD APT and MTRREX showed lower group median COVs (4.57% and 3.89%, respectively) than MTRasym (9.20%) in tumor CE. The Bland-Altman plots show the deviations of APT-weighted CEST in tumor CE and tumor necrotic core measured by three metrics across two sessions (Fig. 4). The result of patient 3 was discarded because the tumor was too small for the CEST imaging spatial resolution, such that, after the downsampling required during registration to CEST space, no voxels from the tumor mask remained. Patients 4, 6 and 7 had a very small necrotic core region, such that the area(s) were too small to register into CEST space and could thus not be assessed separately.

Discussion
In this work, we evaluated the repeatability and reproducibility of APT-weighted CEST imaging at 3 Tesla. This study was performed within and between sessions, and between two different scanners. The repeatability of the APT effect within a session was consistently better than the reproducibility between sessions and between scanners. The majority of COV values in our study were < 30%. In the comparison across three CEST metrics, LD and MTRREX provided more robust measurements than MTRasym, with COV < 10% both in healthy volunteers and patients.
Table 3. Group median [Q1-Q3] of the APT effect per scan session and between-session COV per ROI per metric. The COV of each participant was computed across two sessions performed on different days. Columns: APT-weighted values in Session 1 (%), APT-weighted values in Session 2 (%), COV (%).

To interpret COV, we refer to a grading scheme introduced previously for hepatic perfusion imaging biomarkers 46, where COV < 10% is considered very good, 10% < COV < 20% good and 20% < COV < 30% acceptable. There was a trend of increasing COV from within-session to between-session and between-scanner for each CEST metric tested, going from very good within-session repeatability to good or acceptable between-session and between-scanner reproducibility. This is likely explained by increasing variability between measurements done on separate days and separate scanners. For instance, in two sessions across one week, differences in body temperature and physiological level of protein can influence the magnitude of the CEST effect by affecting the fractional concentration of the solute protons 2, as for instance shown in the liver after fasting 47. However, in particular for the brain, where fluctuations in physiology are expected to be limited, stronger effects on the CEST signal are likely caused by differences in participant positioning (relative to the iso-center of the system, leading to differences in shimming of the field of view) and scanner state or between-scanner set-ups. The different scanners included different head coils and system versions (32-channel coil and DV26 software environment for the MR750 scanner, 24-channel coil and MP26 software environment for the PET-MRI scanner). Moreover, the PET-MRI scanner has a smaller bore size, due to the presence of the PET detectors. All this can influence the signal-to-noise ratio of images, and the B0 and B1 fields, leading to differences in evaluated APT-weighted effects.
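The grading scheme used at the start of this paragraph amounts to a simple threshold lookup, sketched below. The function name is hypothetical, and the handling of values exactly on a boundary (10%, 20%, 30%) and of values above 30% is an assumption, since the scheme as quoted leaves those cases open.

```python
def grade_cov(cov_percent):
    """Map a COV (in percent) to the qualitative grade quoted in the text."""
    if cov_percent < 10.0:
        return "very good"
    if cov_percent < 20.0:   # boundary values are assigned to the poorer grade (assumption)
        return "good"
    if cov_percent < 30.0:
        return "acceptable"
    return "outside the scheme"  # no grade is quoted for COV >= 30%


# Group median COVs reported in the text for WM (LD, within-session)
# and enhancing tumor (LD, MTRREX and MTRasym, between-session).
for label, cov in [("LD WM", 0.56), ("LD CE", 4.57), ("MTRREX CE", 3.89), ("MTRasym CE", 9.20)]:
    print(label, grade_cov(cov))
```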
The higher COV of the between-scanner versus the between-session reproducibility experiments indeed indicates that the influence of these scanner differences seems to be larger than that of body conditions. In comparing the three different APT-weighted CEST metrics, we found the smallest COV for LD APT, which consistently showed very good repeatability and reproducibility (all COV < 10%). This finding is in line with a previous study where COV < 10% within session in WM and GM was reported 14. MTRREX provided good repeatability and reproducibility (< 20%) in most experiments. The smaller COV and better consistency of LD/MTRREX APT compared to MTRasym are likely a result of the way these APT metrics are computed. During the calculation of MTRasym, residual direct water saturation, magnetization transfer effects and nuclear Overhauser effects (NOE) are not compensated for. Signal variation coming from those effects can decrease the repeatability/reproducibility of MTRasym. The magnitude of MTRasym is usually smaller than that of LD and close to 0 (in tumor ~2%, healthy tissue ~−1%). This effectively reduces SNR, which may contribute to the additional variation of MTRasym across sessions, compared to LD/MTRREX. Both LD and MTRREX were calculated by subtracting a two-pool fitted Z-spectrum, in which DS and MT were fitted, from the acquired Z-spectrum data. The APT effects evaluated by LD and MTRREX were calculated based on this subtracted Z-spectrum at 3.5 ppm only, such that the data acquired at −3.5 ppm were not included, thereby avoiding NOE effects in the amide-weighted metrics. This approach of calculating LD and MTRREX, which minimizes the effects of DS, MT and NOE, may be the reason for these metrics providing more consistent measurements than MTRasym.
In patients, the COV of MTRasym reported here is comparable with a recent study where 11-30% COV of MTRasym was found in glioma in between-session experiments, which is considered to fall within acceptable reproducibility 29 for quantifying and monitoring glioma. Note that in both our healthy volunteer and patient studies, much lower COVs for LD and MTRREX were found. For quantitative APT-weighted imaging in clinical practice, a small change of APT effect can impact the ability to differentiate tumors, such as an MTRasym difference of 0.5% between solitary brain metastasis and glioma 48, and an MTRasym difference of 0.4% in the prediction of isocitrate dehydrogenase (IDH) mutation status in grade II gliomas 49. We found between-session differences in MTRasym APT to be ~0.3%. Thus, the use of the advanced metrics LD/MTRREX is preferable when using APT-weighted CEST MRI in diagnosis, not only because these metrics have higher reproducibility, but also because they are likely able to provide better differentiation between tumors in clinical practice.
Additionally, using LD/MTRREX makes it feasible to investigate the changes in MTR at 3.5 ppm and −3.5 ppm independently. While detection of increased amide-weighted MTR at 3.5 ppm has been the main focus for brain tumor imaging, evidence for decreases in NOE-weighted MTR in high grade brain tumors is increasingly reported and potentially useful for diagnostics and treatment follow-up 3,6,50. This gives another reason why the use of MTRasym in brain tumor diagnostics may be suboptimal.
Another promising clinical application of APT-weighted CEST MRI is in early detection of true tumor progression after treatment of high grade gliomas. The COVs found here for all three metrics are likely sufficient for this purpose, even for MTRasym. Most studies that have investigated APTw-CEST MRI are retrospective or cross-sectional and considered a single time point after progression. The difference found between true progression and treatment effect is at minimum 200% (a two-fold higher value) in MTRasym for true progression in these retrospective, cross-sectional studies 51,52. Moreover, in one longitudinal data set (albeit a small cohort) it is reported that there is stable MTRasym signal in progressive disease after surgery and radio- and chemotherapy, with tumor averages of MTRasym after treatment varying between 3.5 and 7% compared to pre-treatment values, whilst there is an almost 70% decrease in MTRasym values for patients with treatment effect 53. Our COV values for LD (between-session COV of approximately 5% for contrast enhancing tumor and 7.5% for tumor core) and even for MTRasym (20-24%) would still be sufficient to detect treatment-related changes with the above expected effect size of, at minimum, 70%.
It should be noted that MTRasym can be acquired by obtaining only a few off-resonance images, making acquisition inherently faster, and more attractive for clinical application, than acquiring the full Z-spectrum required for LD and MTRREX calculations. This is why, in the consensus on application of CEST MRI for brain tumor imaging, MTRasym is currently recommended 9. However, this consensus recommendation includes doing B0 field inhomogeneity correction, either with a separate, fast acquisition or within the CEST acquisition, because of the detrimental effect B0 fluctuations have on MTRasym. While this inherently adds to the acquisition time, advances in image acquisition and analysis are leading to more rapid scan times, not only enabling MTRasym acquisition with B0 correction, but also allowing LD/MTRREX acquisition to become clinically feasible, as exemplified in our current scanning protocol, where all three metrics and B0 correction can be obtained within one single, volumetric scan of fewer than 5 min acquisition time.
In this work, we performed between-scanner reproducibility measurements and provided evidence that the APT-weighted CEST effect can be reproduced well on two scanners. We also showed the feasibility of providing consistent measurements in patients with brain tumors. However, we only included scanners from a single vendor, and our patient cohort was fairly small and heterogeneous. To keep the burden on our patient cohort low, we did not include between-scanner measurements for the patients. Moreover, we cannot fully rule out that there were changes in tumor physiology between the two measurements within the patients because of the highly proliferative nature of high grade tumors. Our future work is aimed at assessing repeatability and reproducibility of scanners from different vendors and at different hospitals, while extending the patient cohort. In particular when it comes to comparing between scanners from different vendors, it will be important to investigate effects from the unavoidable deviations in acquisition parameters and hardware. Such further assessments are essential for the field to deliver good between-session and between-scanner reproducibility, such that APT-weighted CEST can eventually become a quantitative imaging biomarker for clinical practice and multi-centre research trials including brain tumor imaging.

Conclusion
In summary, our study provides further evidence that APT-weighted CEST imaging is repeatable and reproducible in healthy brain and brain tumors across scan sessions and scanners at 3 Tesla. While MTRasym provides acceptable reproducibility, more advanced metrics (LD and MTRREX) show much better reproducibility, which is of importance when subtle differences in APT-weighted CEST are sought for clinical diagnosis or monitoring of brain pathology. Future work in translating APT-weighted CEST MRI for clinical application in brain tumor diagnostics should include measuring across different sites and different vendors to confirm APT-weighted CEST as a reproducible and quantitative imaging biomarker for brain tumor imaging.

Figure 1. Description of reproducibility experiments, including within-session, between-session and between-scanner reproducibility. For healthy volunteers, time between t = 1 and t = 2 was 7 days. For patients, the median time interval was 4 days.

Figure 2. Bland-Altman plots of average APT values in WM from each participant. The plots show the reproducibility of three CEST metrics in WM within-session, between-session and between-scanner.

Table 2. Group median (interquartile range Q1-Q3) of COV for within-session, between-session and between-scanner analysis per ROI per APT-weighted CEST metric.